Abstract: Exchangeable sequences of random probability measures (partitions of mass) and their corresponding exchangeable bridges play an important role in a variety of areas in probability, statistics and related areas, including Bayesian statistics, physics, finance and machine learning. An area of theoretical as well as practical interest is the study of coagulation and fragmentation operators on partitions of mass. In this regard, an interesting but formidable question is the identification of operators and distributional families on mass partitions that exhibit interesting duality relations. In this paper we identify duality relations for a large sub-class of mixed Poisson-Kingman models generated by a stable subordinator. Our results are natural generalizations of the duality relations developed in Pitman [23], Bertoin and Goldschmidt [2], and Dong, Goldschmidt and Martin [7] for the two-parameter Poisson-Dirichlet family. These results are deduced from results for corresponding bridges.
1. Introduction

Exchangeable sequences of random probabilities living in the space $\mathcal{P} = \{s = (s_1, s_2, \ldots) : s_1 \ge s_2 \ge \cdots \ge 0 \text{ and } \sum_{i=1}^{\infty} s_i = 1\}$, and corresponding exchangeable random probability measures on [0, 1], defined as

$$P(p) = \sum_{k=1}^{\infty} P_k \, I(U_k \le p), \qquad (1.1)$$

where $(U_i)$ are iid Uniform[0, 1] variables independent of $(P_i) \in \mathcal{P}$, play an important role in a variety of areas in probability, statistics and related areas, including Bayesian statistics, physics, finance and machine learning. Some references are [5, 8, 9, 6, 11, 14, 27, 16, 17, 24]. Our primary references in this paper will center around applications to coagulation/fragmentation phenomena. For a general summary of some of these applications, and for the concepts and notation we use in this exposition, we refer to the monographs [1, 22], and also [4].

One of the most interesting examples in the literature is the two-parameter Poisson-Dirichlet family of laws on $\mathcal{P}$, say $\mathrm{PD}(\alpha,\theta)$, indexed by $0 \le \alpha < 1$ and $\theta > -\alpha$, as discussed in [26]. The corresponding $\mathrm{PD}(\alpha,\theta)$-bridge, denoted as $P_{\alpha,\theta}(p)$, is the random distribution function defined by setting $(P_i) \sim \mathrm{PD}(\alpha,\theta)$ in (1.1). The $\mathrm{PD}(\alpha,\theta)$ family arises in connection with the lengths of excursions of Bessel processes and often appears, in some guise, in the study of phenomena involving positive $\alpha$-stable subordinators and/or gamma subordinators. These processes also play an important role in Bayesian statistics and machine learning. See Bertoin [1] for applications to coagulation/fragmentation phenomena and Ishwaran and James [11] (see also Pitman [24]) for applications to Bayesian statistics, where in particular $P_{\alpha,\theta}$ is referred to as a Pitman-Yor process. Under this name the process has also been applied to problems arising in natural language processing, see for instance [28, 30, 31]. In fact, as shown explicitly in [30], these methods work with coagulation/fragmentation operations at the level of the Poisson-Dirichlet random probability measures (bridges). They show that these connections lead to a significant reduction in the complexity of an $\infty$-gram natural language model. When $\theta > 0$ and $\alpha = 0$, $P_{0,\theta}$ is a Dirichlet process, made popular by Ferguson [8].

With regard to general $(P_i) \in \mathcal{P}$, an interesting question arising in the study of coagulation and fragmentation processes [1, 22] is as follows. For $X$, $Y$ random exchangeable sequences in $\mathcal{P}$, describe in an informative way the conditional distributions of $X|Y$ and $Y|X$. Naturally $X$ and $Y$ should also have some interesting interpretations. We also note that it is not necessarily the case that the laws of both $X$ and $Y$ are initially known. This is the essence of what is known as a coagulation-fragmentation duality, and is generally a difficult problem. Generically this duality can be read using the following diagram for $X$, $Y$ in $\mathcal{P}$,

$$Y \;\underset{Y|X}{\overset{X|Y}{\rightleftarrows}}\; X$$

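Pitman [23] was able to derive a remarkable duality formula for certain members of the $\mathrm{PD}(\alpha,\theta)$ family, where in particular he describes the relationships between $X \sim \mathrm{PD}(\alpha\delta,\theta)$ and $Y \sim \mathrm{PD}(\alpha,\theta)$ for $0 < \delta < 1$. This relationship acts in a multiplicative fashion on the first component. The coagulation/fragmentation duality in Pitman [23] may be described in terms of the following diagram as given in [22]; for $0 < \alpha < 1$, $0 \le \delta < 1$, $\theta > -\alpha\delta$,

$$\mathrm{PD}(\alpha,\theta) \;\underset{\mathrm{PD}(\alpha,-\alpha\delta)-\mathrm{Frag}}{\overset{\mathrm{PD}(\delta,\theta/\alpha)-\mathrm{Coag}}{\rightleftarrows}}\; \mathrm{PD}(\alpha\delta,\theta) \qquad (1.2)$$
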
More recently, using the $\mathrm{PD}(0,\theta)$ family, Bertoin and Goldschmidt [2] describe an additive duality relationship where $X \sim \mathrm{PD}(0,\theta)$ and $Y \sim \mathrm{PD}(0,1+\theta)$. This additive duality is generalized to the $\mathrm{PD}(\alpha,\theta)$ family in Dong, Goldschmidt, and Martin (DGM) [7]. Their results can be represented as follows, for $\theta > -\alpha$ and $0 \le \alpha < 1$,

$$\mathrm{PD}(\alpha,1+\theta) \;\underset{\mathrm{Frag}-\mathrm{PD}(\alpha,1-\alpha)}{\overset{\beta_{\left(\frac{1-\alpha}{\alpha},\frac{\theta+\alpha}{\alpha}\right)}-\mathrm{Coag}}{\rightleftarrows}}\; \mathrm{PD}(\alpha,\theta) \qquad (1.3)$$

We will give a precise meaning of the Coag/Frag operators later.

In general, it is not clear how one can obtain similar results for other $(\alpha,\theta)$ parameter values or other families in $\mathcal{P}$. In this paper, using results we develop for bridges, we identify a large class of laws on $\mathcal{P}$ where explicit duality relations exist. These can be seen as natural extensions of the results in [23, 2, 7]. The class represents a sub-class of Poisson-Kingman mixtures generated by stable subordinators that we denote as having laws $\mathbb{P}_\alpha(\zeta)$, where $\zeta$ denotes a non-negative random variable. We describe more details of this class, as well as relevant results for more general processes, in the next section.

2. Exchangeable bridges and partitions 

Following Bertoin [1, Definition 2.1, p. 67] (see also Pitman [22, Section 5]), an infinite numerical sequence $s = (s_1, s_2, \ldots)$ is said to be a mass-partition if $s$ is an element of the space

$$\mathcal{P}_m = \{s = (s_1, s_2, \ldots) : s_1 \ge s_2 \ge \cdots \ge 0 \text{ and } \sum_{i=1}^{\infty} s_i \le 1\}.$$

The quantity

$$s_0 := 1 - \sum_{i=1}^{\infty} s_i,$$

which may be 0, is referred to as the total mass of dust. From Bertoin ([1], Definition 4.6, p. 191), a random càdlàg process $b_s$ on [0, 1] is said to be an $s$-bridge if it is distributed as

$$b_s(y) = s_0 y + \sum_{k=1}^{\infty} s_k \, I(U_k \le y), \qquad y \in [0, 1],$$

for $(U_i)$ a sequence of iid Uniform[0, 1] random variables. If $s \sim P$, i.e. if $s$ is randomized according to some law $P$, then $b_s$ is said to be a $P$-bridge. It follows that $\mathcal{P}$ is the subspace of $\mathcal{P}_m$ on which $\sum_{i} s_i = 1$, i.e. $s_0 = 0$. Furthermore, for all $s \in \mathcal{P}_m$, $\mathrm{Rank}(s_0, s) \in \mathcal{P}$. Hence we see that the random probability measure in (1.1) is a $P$-bridge, with $(s_i) = (P_i) \in \mathcal{P}$ distributed according to some law $P$ with $s_0 = 0$. An important property, which we shall exploit, is that the law of the $P$-bridge is in bijection to the law of the sequence $(P_i) \sim P$. Additionally let

$$b_s^{-1}(r) = \inf\{v \in [0, 1] : b_s(v) > r\}, \qquad r \in [0, 1],$$

denote the right continuous inverse of the bridge. Equivalently this is a random quantile function. An exchangeable partition of $[n] = \{1, 2, \ldots, n\}$ generated from an exchangeable bridge, say $b_s$, can be obtained by the equivalence relation

$$i \sim j \ \text{ iff } \ b_s^{-1}(U'_i) = b_s^{-1}(U'_j)$$

based on $n$ iid Uniform[0, 1] variables $(U'_1, \ldots, U'_n)$. An infinite partition $\Pi$ of $\mathbb{N}$ is formed by considering a countably infinite set of uniforms. The distribution of such infinite exchangeable partitions is referred to as an exchangeable partition probability function (EPPF). The law of the $P$-bridge is thus also in bijection to the corresponding EPPF, which specifies the law of the exchangeable partition $\Pi$ with ranked frequencies $(P_i)$ and which we shall refer to as a $P$-EPPF. In this manuscript we will also utilize properties of simple bridges. In particular if $s = (u, 0, \ldots)$ is a simple mass-partition then $b_u(y) = (1 - u)y + u\,I(U_1 \le y)$ is referred to as a simple bridge. If $u = s_1$ is a random variable then one has a randomized simple bridge given by

$$b_{s_1}(y) = s_0 y + s_1 I(U_1 \le y). \qquad (2.1)$$
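
The equivalence relation above can be sketched in code. The following is a minimal illustration, not taken from the paper: it uses the paintbox form of the construction, in which each uniform is matched to an interval of length $s_k$, and which yields a partition with the same law as the one read off the bridge inverse. The function name `sample_partition`, the truncation of $s$ to a finite list, and the labelling of $[n]$ as $0, \ldots, n-1$ are assumptions made for the sketch.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def sample_partition(s, n, rng):
    """Sample a partition of {0, ..., n-1} from a finite mass partition s.

    s lists the masses s_1 >= s_2 >= ... with sum(s) <= 1; the remaining
    mass s_0 = 1 - sum(s) is dust. Items whose uniform lands in the same
    interval share a block; items landing in the dust become singletons.
    """
    cuts = list(accumulate(s))       # right endpoints of the intervals
    blocks = {}                      # interval index -> members of the block
    singletons = []
    for i in range(n):
        u = rng.random()
        k = bisect_right(cuts, u)    # index of the interval containing u
        if k < len(s):
            blocks.setdefault(k, []).append(i)
        else:
            singletons.append([i])   # u fell in the dust
    return list(blocks.values()) + singletons


partition = sample_partition([0.5, 0.3], 10, random.Random(0))
```

With `s = [0.5, 0.3]` the dust has mass 0.2, so on average a fifth of the items end up as singleton blocks.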

2.1. Poisson Kingman distributions determined by a stable 
subordinator 

Recall from Pitman [20] that for $0 < \alpha < 1$ a sequence $(P_i)$ has a Poisson-Kingman law generated by an $\alpha$-stable subordinator with mixing distribution $\eta$, say $\mathrm{PK}_\alpha(\eta)$, if its law can be constructed as follows. Let $(J_i)$ denote the ranked jumps of a stable subordinator such that $T = \sum_{k=1}^{\infty} J_k$ is equivalent in distribution to a positive $\alpha$-stable random variable, with density denoted as $f_\alpha(t)$ and whose log Laplace transform is given by $-C\omega^\alpha$ for some constant $C > 0$ and each $\omega > 0$. Hereafter, due to scaling properties, we can take $C = 1$. Set $(P_i = J_i/T)$; then it follows that $(P_i)$ has a $\mathrm{PD}(\alpha,0)$ distribution. Denote by $\mathrm{PD}(\alpha|t)$ the conditional distribution of $(P_i)\,|\,T = t$; then

$$\mathrm{PK}_\alpha(\eta) := \int_0^{\infty} \mathrm{PD}(\alpha|t)\,\eta(dt).$$

The $\mathrm{PD}(\alpha,\theta)$ laws arise as a special case by choosing $\eta(dt)/dt$ proportional to $t^{-\theta} f_\alpha(t)$, which is the density of a polynomially tilted stable random variable. The classical Poisson-Dirichlet case, $\mathrm{PD}(0,\theta)$, arises by letting $\alpha$ go to zero in an appropriate sense. An important feature of the general $\mathrm{PK}_\alpha(\eta)$ class of laws

imsart-generic ver. 2009/12/15 file: ArGibbsBridges.tex date: August 17, 2010 



Lancelot F. James/Coag/Frag Duality 






and its limiting cases is that, as shown in [20, 22, 10] (see for instance Pitman [20, Theorem 8, p. 14]), these are the only cases where the EPPF of an infinite exchangeable random partition $\Pi$ with ranked frequencies $(P_i)$ has Gibbs form. Additionally, we will make use of the following fact: if $T$ is a random variable with distribution $\eta$, then, from Pitman [20, Proposition 13, p. 20], $S = T^{-\alpha}$ is the $\alpha$-DIVERSITY of the $\mathrm{PK}_\alpha(\eta)$ partition. That is, if $K_n$ denotes the number of distinct blocks of a $\mathrm{PK}_\alpha(\eta)$-EPPF partition of $[n]$, then $K_n/n^\alpha$ converges almost surely to $S$ as $n$ converges to $\infty$, and almost surely,

$$T = S^{-1/\alpha} := \lim_{i \to \infty} \left(i\,\Gamma(1-\alpha)\,P_i^{\alpha}\right)^{-1/\alpha}.$$

In other words $S$ and $T$ are completely determined by the corresponding $(P_i)$ sequence.
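
The convergence of $K_n/n^\alpha$ can be explored numerically in the special case $\mathrm{PD}(\alpha,\theta)$, whose partitions can be grown one element at a time by the two-parameter Chinese restaurant seating rule. The sketch below is an illustration under that assumption, not a sampler for a general $\mathrm{PK}_\alpha(\eta)$ law; the function name `chinese_restaurant` and the choice $n = 5000$, $\alpha = 0.5$, $\theta = 0$ are hypothetical.

```python
import random


def chinese_restaurant(n, alpha, theta, rng):
    """Block sizes of a PD(alpha, theta) partition of [n] (two-parameter CRP)."""
    sizes = []
    for m in range(n):                       # m items are already placed
        k = len(sizes)
        # open a new block with probability (theta + k*alpha) / (m + theta)
        if m == 0 or rng.random() * (m + theta) < theta + k * alpha:
            sizes.append(1)
            continue
        # otherwise join block j with probability proportional to size_j - alpha
        x = rng.random() * (m - k * alpha)
        for j, size in enumerate(sizes):
            x -= size - alpha
            if x < 0:
                sizes[j] += 1
                break
        else:
            sizes[-1] += 1                   # guard against floating-point round-off
    return sizes


rng = random.Random(2010)
n, alpha = 5000, 0.5
sizes = chinese_restaurant(n, alpha, 0.0, rng)
diversity_estimate = len(sizes) / n ** alpha   # one noisy draw approximating S
```

Each run yields a different value of `diversity_estimate`, since the limit $S = T^{-\alpha}$ is itself random.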



2.2. The $\mathbb{P}_\alpha(\zeta)$ family

As mentioned in the introduction, in this paper we show that one can extend the results of Pitman [23], Bertoin and Goldschmidt [2], and Dong, Goldschmidt, and Martin (DGM) [7] to a large class of processes whose law is given by $\mathrm{PK}_\alpha(\eta^*)$, where $\eta^*$ belongs to a class of mixing distributions corresponding to random variables of the form

$$T \overset{d}{=} \frac{\tau_\alpha(\zeta)}{\zeta^{1/\alpha}}.$$

Here $\zeta$ is a non-negative random variable taken independent of $(\tau_\alpha(s), s \ge 0)$, which is a generalized gamma subordinator whose Lévy exponent, i.e. the $-\log$ Laplace transform of $\tau_\alpha(1)$, is given by

$$\psi_\alpha(\omega) = (1 + \omega)^\alpha - 1 \qquad (2.2)$$

for $\omega \ge 0$. The conditional density of $T\,|\,\zeta$ is given by

$$f_\alpha(s)\,e^{-(s\zeta^{1/\alpha} - \zeta)},$$

and hence the density of $T$ can be expressed as

$$\eta^*(ds)/ds = f_\alpha(s)\int_0^{\infty} e^{-(s y^{1/\alpha} - y)}\,F_\zeta(dy) = f_\alpha(s)\,E\!\left[e^{-(s\zeta^{1/\alpha} - \zeta)}\right],$$

where $F_\zeta$ denotes the distribution function of $\zeta$. Hence, if $\zeta$ is random, a conditional distribution of $\zeta\,|\,T = s$ is specified by

$$F_{\zeta,\alpha}(dy|s) \propto e^{-(s y^{1/\alpha} - y)}\,F_\zeta(dy). \qquad (2.3)$$

It follows that, for fixed $\alpha$, the law of $(P_i) \sim \mathrm{PK}_\alpha(\eta^*)$ varies according to the distribution of $\zeta$, and hence we denote this law as $\mathbb{P}_\alpha(\zeta) := \mathrm{PK}_\alpha(\eta^*)$. Importantly, the corresponding $\mathbb{P}_\alpha(\zeta)$-bridge can be written as

$$Q_{\alpha,\zeta}(y) = \frac{\tau_\alpha(\zeta y)}{\tau_\alpha(\zeta)} = \sum_{i=1}^{\infty} P_i\,I(U_i \le y), \qquad y \in [0, 1], \qquad (2.4)$$

where $(P_i) \sim \mathbb{P}_\alpha(\zeta)$.

This construction of $\mathbb{P}_\alpha(\zeta)$ laws coincides with random processes discussed in Pitman and Yor [26, p. 877-878], which are used to prove Pitman and Yor [26, Proposition 21, p. 869]. Proposition 21 of that work shows that if $\zeta = \gamma_{\theta/\alpha}$, where $\gamma_{\theta/\alpha}$ denotes a random variable with a gamma distribution with shape parameter $\theta/\alpha$ and scale 1, then $\mathbb{P}_\alpha(\gamma_{\theta/\alpha}) = \mathrm{PD}(\alpha,\theta)$ for $\theta > 0$. Note this does not include the case of $\mathrm{PD}(\alpha,\theta)$ for $-\alpha < \theta < 0$. However $\mathbb{P}_\alpha(0) = \mathrm{PD}(\alpha,0)$. Furthermore when $\zeta = b$ is a positive constant, $\mathbb{P}_\alpha(b)$ corresponds to the case of the Poisson-Kingman model determined by the generalized gamma subordinator as described in Pitman [20, Section 5.2]. In this case the bridge $Q_{\alpha,b}$ has been studied from a Bayesian perspective in [13, 18, 19, 15]. However it is evident that, due to the generality of $\zeta$, the class of $\mathbb{P}_\alpha(\zeta)$ laws is significantly larger than the special cases mentioned.

In order to establish our results we will work directly with $\mathbb{P}_\alpha(\zeta)$-bridges, $Q_{\alpha,\zeta}$. In fact, we will show that working with $Q_{\alpha,\zeta}$ is rather transparent in terms of identifying which laws on $\mathcal{P}$ are related in the sense of Coag-Frag operators. While the operators we discuss are of similar type to those in [23, 2, 7], we cannot rely on the fine properties of the $\mathrm{PD}(\alpha,\theta)$ family utilized by those authors. For example, one can show that the coagulation operators in [23, 2, 7] are in bijection to the operation of composition of independent bridges. The dual relationship between compositions of independent bridges and coagulation operations can be found in the works of Bertoin and Le Gall [1, 3, 4] and Pitman [22, Lemma 5.18]. Our coagulation operations will be defined via the compositions of generally dependent bridges. Nonetheless, for a given input sequence $(p_i)$, we are able to give good descriptions of the conditional distribution of the relevant coagulation operator, which, as we shall show, reduce to conditional distributions given the DIVERSITY or local time determined by the input sequence. We will also show that the dual fragmentation operators are exactly the same as those used in [23, 2, 7], where, in contrast to the coagulation operators, our inputs are indeed independent of the respective $\mathrm{PD}(\alpha,-\alpha\delta)$ and $\mathrm{PD}(\alpha,1-\alpha)$ fragmenting variables in $\mathcal{P}$.

3. Pitman style coagulation and fragmentation operations for $\mathbb{P}_\alpha(\zeta)$

In order to establish an analogue of (1.2) for the $\mathbb{P}_\alpha(\zeta)$ family of laws we first identify an appropriate coagulation operation. As we mentioned in the previous section there is a close relationship between the notion of coagulation operators on $\mathcal{P}$, or in terms of corresponding exchangeable partitions, and the idea of composition of bridges, say

$$F_1(y) = \sum_{k=1}^{\infty} P_k^{(1)}\,I(V_k \le y) \quad\text{and}\quad F_2(y) = \sum_{k=1}^{\infty} P_k^{(2)}\,I(U_k \le y),$$

where $(V_k)$ and $(U_k)$ are independent sequences of iid uniform variables, and independent of these, $(P_k^{(1)})$ and $(P_k^{(2)})$ have marginal laws on $\mathcal{P}$ denoted as $P^{(1)}$ and $P^{(2)}$. For instance, following Bertoin [1, Section 4] a coagulation operation on partitions can be defined by the composition

$$F_2(F_1(y)) := F_2 \circ F_1(y) = \sum_{k=1}^{\infty} P_k^{(2)}\,I(F_1^{-1}(U_k) \le y)$$

in terms of partitions induced by the relation

$$i \sim j \ \text{ iff } \ F_1^{-1} \circ F_2^{-1}(U'_i) = F_1^{-1} \circ F_2^{-1}(U'_j) \qquad (3.1)$$

based on $n$ iid Uniform[0, 1] variables $(U'_1, \ldots, U'_n)$. If viewed in stages, one first creates a partition following the law of the $P^{(2)}$-EPPF associated with $F_2$ by the relationship

$$i \sim j \ \text{ iff } \ F_2^{-1}(U'_i) = F_2^{-1}(U'_j).$$

Given the partition of $[n]$, say $\{B_1, \ldots, B_{K_n^{(2)}}\}$, induced by this operation with $K_n^{(2)} = k$ unique blocks, there are $U^*_1, \ldots, U^*_k$ distinct iid uniform variables associated with the $k$ blocks with labels $\{1, \ldots, k\}$. The blocks are further merged by the relation, i.e. merge $B_i$ and $B_j$ according to

$$i \sim j \ \text{ iff } \ F_1^{-1}(U^*_i) = F_1^{-1}(U^*_j).$$

From Pitman [22, Section 5, Lemma 5.18] the corresponding Coag operator on $\mathcal{P}$, which includes the Coag operator in (1.2), is defined as follows. Let $(I_j^{(1)})$ denote the interval partition of [0, 1], as described in [22, p. 111], induced by a $P^{(1)}$-bridge; then for $(P_i^{(2)}) \sim P^{(2)}$ it follows that

$$\mathrm{Rank}\Big(\sum_{i=1}^{\infty} P_i^{(2)}\,I(U_i \in I_j^{(1)}),\ j \ge 1\Big)$$

is equivalent in distribution to the sequence in $\mathcal{P}$ induced by the composition of bridges $F_2 \circ F_1$. Hence under these specifications, setting $(P_i^{(2)}) = (p_i)$, the Coag operator $(P^{(1)} - \mathrm{Coag})((p_i), \cdot)$ is the distribution of

$$\mathrm{Rank}\Big(\sum_{i=1}^{\infty} p_i\,I(U_i \in I_j^{(1)}),\ j \ge 1\Big).$$
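
For finite inputs the Coag operator is a short computation. The following sketch assumes the interval partition is given by a finite list of lengths `q`, laid out left to right on [0, 1]; since the $U_i$ are iid uniform, the order of the intervals does not affect the law of the result. The function name `coag` and the finite truncation are assumptions made for illustration.

```python
from bisect import bisect_right
from itertools import accumulate


def coag(p, q, u):
    """Rank( sum_i p_i 1{u_i in I_j}, j >= 1 ) for intervals I_j of lengths q_j.

    p : masses of the input sequence
    q : lengths of the intervals (sum(q) <= 1; any remainder is dust)
    u : one point of [0, 1) per input mass
    """
    cuts = list(accumulate(q))            # right endpoints of the intervals
    merged = [0.0] * len(q)
    unmerged = []
    for p_i, u_i in zip(p, u):
        j = bisect_right(cuts, u_i)       # interval containing u_i
        if j < len(q):
            merged[j] += p_i
        else:
            unmerged.append(p_i)          # u_i hit the dust: p_i is left alone
    return sorted((m for m in merged + unmerged if m > 0), reverse=True)


result = coag([0.5, 0.25, 0.25], [0.5, 0.5], [0.1, 0.7, 0.2])
```

In this call the first and third masses land in the first interval and are merged, so `result` is `[0.75, 0.25]`.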



In the literature it is usually assumed that the sequences $(P_k^{(1)})$, $(P_k^{(2)})$ are independent, which would mean that the interval partition $(I_j^{(1)})$ is independent of $(P_i^{(2)})$. In terms of the relation (3.1) this means that the merging of blocks in the second stage only depends on the number of blocks $K_n^{(2)} = k$ and is otherwise conducted independently with respect to a $P^{(1)}$-EPPF. This is the case for the operator defined in (1.2). However it is clear, working with the explicit constructions of $F_1$ and $F_2$, and using the relation (3.1), that $(P^{(1)} - \mathrm{Coag})((p_i), \cdot)$, a coagulation operator induced by possibly dependent sequences $(P_k^{(1)})$, $(P_k^{(2)})$, still makes sense, except that now its distribution is a bit more complicated.

We now show that the relevant coagulation operator for the $\mathbb{P}_\alpha(\zeta)$ class is of this form, but despite this extra dependence we will be able to show that its distribution can be described quite clearly.
3.1. Compositions of $\mathbb{P}_\alpha(\zeta)$-bridges and resulting Coag operators

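For $0 < \alpha < 1$ and $0 < \delta < 1$, let $\tau_\alpha$ and $\tau_\delta$ denote independent generalized gamma subordinators with laws specified by (2.2), where in the second case $\alpha$ is replaced by $\delta$. Then for a common random variable $\zeta$, define bridges

$$Q_{\delta,\zeta}(y) = \frac{\tau_\delta(\zeta y)}{\tau_\delta(\zeta)} \quad\text{and}\quad Q_{\alpha,\tau_\delta(\zeta)}(y) = \frac{\tau_\alpha(\tau_\delta(\zeta) y)}{\tau_\alpha(\tau_\delta(\zeta))}. \qquad (3.2)$$

If $\zeta = \gamma_{\theta/(\delta\alpha)}$, then $\tau_\delta(\zeta) \overset{d}{=} \gamma_{\theta/\alpha}$ and it can be deduced from Pitman and Yor [26, Proposition 21, p. 869, and p. 877-878] that $Q_{\delta,\zeta}$ is a $\mathrm{PD}(\delta,\theta/\alpha)$-bridge and $Q_{\alpha,\tau_\delta(\zeta)}$ is a $\mathrm{PD}(\alpha,\theta)$-bridge. When $\zeta = 0$, the bridges reduce to the case of $\mathrm{PD}(\delta,0)$ and $\mathrm{PD}(\alpha,0)$. We can deduce further from Pitman and Yor [26, p. 877-878] that these are the only cases where the bridges $Q_{\alpha,\tau_\delta(\zeta)}$ and $Q_{\delta,\zeta}$ are independent. Nonetheless, it is obvious by construction that the composition of these bridges yields

$$Q_{\alpha,\tau_\delta(\zeta)}(Q_{\delta,\zeta}(y)) = Q_{\alpha\delta,\zeta}(y). \qquad (3.3)$$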

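Note that we use the fact that $\tau_\alpha(\tau_\delta(\cdot)) \overset{d}{=} \tau_{\alpha\delta}(\cdot)$. Hence, this allows us to write

$$\mathbb{P}_\alpha(\tau_\delta(\zeta)) \;\xrightarrow{\ \mathbb{P}_\delta(\zeta)-\mathrm{Coag}\ }\; \mathbb{P}_{\alpha\delta}(\zeta) \qquad (3.4)$$

where an initial description of $\mathbb{P}_\delta(\zeta) - \mathrm{Coag}$ is given in the next proposition.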
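
The subordination identity used here can be checked at the level of Lévy exponents. Conditioning on $\tau_\delta(t)$ gives $E[e^{-\omega\tau_\alpha(\tau_\delta(t))}] = e^{-t\,\psi_\delta(\psi_\alpha(\omega))}$, so it suffices that $\psi_\delta(\psi_\alpha(\omega)) = \psi_{\alpha\delta}(\omega)$. The same calculation with a gamma time change gives $\tau_\delta(\gamma_a) \overset{d}{=} \gamma_{\delta a}$. The sketch below verifies both numerically; the function names are hypothetical.

```python
import math


def psi(omega, a):
    """Levy exponent (1 + omega)^a - 1 of the generalized gamma subordinator, as in (2.2)."""
    return (1.0 + omega) ** a - 1.0


def composed_exponent(omega, alpha, delta):
    """Exponent of tau_alpha(tau_delta(.)), obtained by conditioning on tau_delta."""
    return psi(psi(omega, alpha), delta)


def laplace_tau_at_gamma(omega, delta, shape):
    """E[exp(-omega * tau_delta(gamma_shape))] for an independent gamma(shape, 1) time."""
    return (1.0 + psi(omega, delta)) ** (-shape)


alpha, delta, theta = 0.6, 0.3, 2.0
checks = [
    math.isclose(composed_exponent(w, alpha, delta), psi(w, alpha * delta), rel_tol=1e-9)
    for w in (0.1, 1.0, 7.5)
]
# tau_delta(gamma_{theta/(delta*alpha)}) should have the Laplace transform of gamma_{theta/alpha}
gamma_check = math.isclose(
    laplace_tau_at_gamma(1.0, delta, theta / (delta * alpha)),
    (1.0 + 1.0) ** (-theta / alpha),
    rel_tol=1e-9,
)
```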

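Proposition 3.1. Considering the bridges in (3.2), let $(I_j^{(P)}, j \ge 1)$, for $P = \mathbb{P}_\delta(\zeta)$, denote the interval partition induced by the $\mathbb{P}_\delta(\zeta)$-bridge $Q_{\delta,\zeta}$. Writing

$$Q_{\alpha,\tau_\delta(\zeta)}(y) = \sum_{k=1}^{\infty} P_k\,I(U_k \le y),$$

it follows that the sequence $(P_k)$ has marginal distribution $\mathbb{P}_\alpha(\tau_\delta(\zeta))$, but is not in general independent of the $\mathbb{P}_\delta(\zeta)$ interval partition $(I_j^{(P)}, j \ge 1)$.

(i) However, from (3.3) it follows that

$$\mathrm{Rank}\Big(\sum_{k=1}^{\infty} P_k\,I(U_k \in I_j^{(P)}),\ j \ge 1\Big) \sim \mathbb{P}_{\alpha\delta}(\zeta). \qquad (3.5)$$

(ii) Hence setting $(P_k) = (p_k)$, the $\mathbb{P}_\delta(\zeta) - \mathrm{Coag}((p_k), \cdot)$ is the distribution on $\mathcal{P}$ of

$$\mathrm{Rank}\Big(\sum_{k=1}^{\infty} p_k\,I(U_k \in I_j^{(P)}),\ j \ge 1\Big), \qquad (3.6)$$

where the conditional distribution of the $\mathbb{P}_\delta(\zeta)$ interval partition $(I_j^{(P)}, j \ge 1)$ given $(P_k) = (p_k)$ is not independent of $(p_k)$, and is otherwise determined by the constructions in (3.2).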
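The description of the distribution of the $\mathbb{P}_\delta(\zeta) - \mathrm{Coag}$ operator in statement [(ii)] is rather vague. We now provide a much better description. In regards to the bridges defined in (3.2), set

$$T_1 = \frac{\tau_\delta(\zeta)}{\zeta^{1/\delta}} \quad\text{and}\quad T_2 = \frac{\tau_\alpha(\tau_\delta(\zeta))}{[\tau_\delta(\zeta)]^{1/\alpha}}. \qquad (3.7)$$

$T_1^{-\delta}$ is the $\delta$-DIVERSITY of $\mathbb{P}_\delta(\zeta)$ and $T_2^{-\alpha}$ is the $\alpha$-DIVERSITY of $\mathbb{P}_\alpha(\tau_\delta(\zeta))$. Hence they are completely determined given realizations from the respective $\mathbb{P}_\delta(\zeta)$ and $\mathbb{P}_\alpha(\tau_\delta(\zeta))$ sequences in $\mathcal{P}$. Now setting $T_1 = s$ it follows that

$$T_2 = \frac{\tau_\alpha(\zeta^{1/\delta}s)}{(\zeta^{1/\delta}s)^{1/\alpha}} \quad\text{and}\quad Q_{\alpha,\tau_\delta(\zeta)}(y) = \frac{\tau_\alpha(\zeta^{1/\delta}s\,y)}{\tau_\alpha(\zeta^{1/\delta}s)}. \qquad (3.8)$$

Applying Bayes rule, a conditional density of $T_1\,|\,T_2 = v, \zeta$ is proportional to

$$f_\delta(s)\,e^{-v\zeta^{1/(\alpha\delta)}s^{1/\alpha}}.$$

A conditional density of $T_2\,|\,\zeta$ is

$$f_{T_2}(v|\zeta) = f_\alpha(v)\int_0^{\infty} e^{-v\zeta^{1/(\alpha\delta)}s^{1/\alpha}}\,e^{\zeta}\,f_\delta(s)\,ds.$$

Hence it follows that a conditional density of $T_1\,|\,T_2 = v$ is given by

$$\eta^{(v)}(ds)/ds \propto f_\delta(s)\,E\!\left[e^{-v\zeta^{1/(\alpha\delta)}s^{1/\alpha}}\,e^{\zeta}\right]. \qquad (3.9)$$

Using Pitman and Yor [26, p. 877-878] now gives the following result.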

Theorem 3.1. Consider the setting in Proposition 3.1, with dependent bridges defined by (3.2), and the associated variables $T_1$ and $T_2$ defined by (3.7). Then, for the sequence $(P_k)$ whose marginal follows a $\mathbb{P}_\alpha(\tau_\delta(\zeta))$ distribution, set $(P_k) = (p_k^{(v)})$, where this indicates that the particular realization $(p_k^{(v)})$ corresponds to $T_2 = v$. Then the distribution of the $\mathbb{P}_\delta(\zeta) - \mathrm{Coag}((p_k^{(v)}), \cdot)$ given $(P_k) = (p_k^{(v)})$ is equivalent to the distribution of

$$\mathrm{Rank}\Big(\sum_{k=1}^{\infty} p_k^{(v)}\,I(U_k \in I_j^{(P)}),\ j \ge 1\Big), \qquad (3.10)$$

where for fixed $(p_k^{(v)})$, $(I_j^{(P)}, j \ge 1)$ is equivalent in distribution to a $P$ interval partition, with

$$P = \mathrm{PK}_\delta(\eta^{(v)}) := \int_0^{\infty} \mathrm{PD}(\delta|s)\,\eta^{(v)}(ds).$$

That is, the conditional distribution of the $\mathbb{P}_\delta(\zeta)$ interval partition given $(p_k^{(v)})$ only depends on $T_2$, and equates with the interval partition of a Poisson-Kingman law generated by a $\delta$-stable subordinator with mixing distribution $\eta^{(v)}$ defined in (3.9). Equivalently the conditional distribution of the marginally $\mathbb{P}_\delta(\zeta)$-bridge constructed in (3.2), given $(p_k^{(v)})$, is equivalent to a $\mathrm{PK}_\delta(\eta^{(v)})$-bridge.
Proof. Noting (3.8), it follows from Pitman and Yor [26, p. 877, see eq. (96) and (97)] that the bridges $Q_{\delta,\zeta}$, $Q_{\alpha,\tau_\delta(\zeta)}$ are conditionally independent given $T_1 = s$, $T_2 = v$ and $\zeta$, and have $\mathrm{PD}(\delta|s)$ and $\mathrm{PD}(\alpha|v)$ distributions respectively. Hence the conditional distribution of the bridge $Q_{\delta,\zeta}$ given $(P_k) = (p_k^{(v)})$ equates with the conditional distribution of $Q_{\delta,\zeta}$ given $T_2 = v$, which is obtained by finding the conditional density of $T_1\,|\,T_2 = v$. □

In the next result, we show that the construction of the bridges in (3.2), and the results discussed in Pitman and Yor [26, p. 877], identify a coagulation operation expressed in terms of conditionally independent processes.

Theorem 3.2. Consider the bridges defined by (3.2), and the associated variables $T_1$ and $T_2$ defined by (3.7). Then,

(i) conditional on $T_1 = s$, the bridges $Q_{\delta,\zeta}$ and $Q_{\alpha,\tau_\delta(\zeta)}$ are conditionally independent.
(ii) In particular, given $T_1 = s$, $Q_{\delta,\zeta}$ has the distribution of a $\mathrm{PD}(\delta|T_1 = s)$-bridge not depending on $\zeta$.
(iii) Conditional on $T_1 = s$ and $\zeta = b$, $Q_{\alpha,\tau_\delta(\zeta)}$ is a $\mathbb{P}_\alpha(b^{1/\delta}s)$-bridge, that is to say a generalized gamma bridge.
(iv) Conditional on $T_1 = s$, $Q_{\alpha,\tau_\delta(\zeta)}$ is a $\mathbb{P}_\alpha(\zeta^{1/\delta}s)$-bridge, where the law of $\zeta$ depends conditionally on $T_1 = s$, and is specified by $F_{\zeta,\delta}(\cdot|s)$ defined in (2.3).
(v) In reference to the $\mathbb{P}_\delta(\zeta) - \mathrm{Coag}((p_k), \cdot)$ defined by (3.10), it follows that conditional on $T_1 = s$ and $(P_k) = (p_k)$ the distribution of the $\mathbb{P}_\delta(\zeta)$ interval partition $(I_j, j \ge 1)$ does not depend on $(P_k)$ and is equivalent in distribution to a $\mathrm{PD}(\delta|T_1 = s)$ interval partition. Conditional on $T_1 = s$, the sequence $(P_k)$ follows a generalized gamma law $\mathbb{P}_\alpha(\zeta^{1/\delta}s)$.

Proof. Noting (3.8), it follows that the bridge $Q_{\alpha,\tau_\delta(\zeta)}$ can be expressed in terms of some function of the variables $(\tau_\alpha, T_1, \zeta)$, where $\tau_\alpha$ is independent of the pair $(T_1, \zeta)$ and also of $Q_{\delta,\zeta}$. From Pitman and Yor [26, p. 877, see eq. (96) and (97)], it follows that $Q_{\delta,\zeta}$ conditioned on $T_1 = s$ is conditionally independent of $\zeta$, and has the law of a $\mathrm{PD}(\delta|T_1 = s)$-bridge. These points establish statements [(i)] and [(ii)]. Statements [(iii)] and [(iv)] easily follow from the explicit construction of $Q_{\alpha,\tau_\delta(\zeta)}$ given in (3.8). Statement [(v)] follows as a consequence of statements [(i)] to [(iv)]. □

The result shows that by conditioning on $T_1 = s$, where $T_1^{-\delta}$ is the $\delta$-DIVERSITY corresponding to $\mathbb{P}_\delta(\zeta)$, the composition of dependent bridges described in (3.3) can be first expressed in terms of the composition of conditionally independent bridges, all of which depend on a parameter $s$. Call a bridge a $\mathbb{P}_{\alpha\delta}(\zeta|s)$-bridge if its law is equivalent in distribution to the conditional distribution of $Q_{\alpha,\tau_\delta(\zeta)} \circ Q_{\delta,\zeta}$ given $T_1 = s$. Then there is the following relation,

$$\mathbb{P}_\alpha(\zeta^{1/\delta}s) \;\xrightarrow{\ \mathrm{PD}(\delta|s)-\mathrm{Coag}\ }\; \mathbb{P}_{\alpha\delta}(\zeta|s) \qquad (3.11)$$
where relative to (3.11), for an input $(P_k) = (p_k)$ from a $\mathbb{P}_\alpha(\zeta^{1/\delta}s)$ sequence in $\mathcal{P}$, the distribution of the Coag operator $\mathrm{PD}(\delta|s) - \mathrm{Coag}((p_k), \cdot)$ is equivalent to the distribution of

$$\mathrm{Rank}\Big(\sum_{k=1}^{\infty} p_k\,I(U_k \in I_j^{\mathrm{PD}(\delta|s)}),\ j \ge 1\Big),$$

where now $(I_j^{\mathrm{PD}(\delta|s)})$ denotes a $\mathrm{PD}(\delta|s)$ interval partition that is independent of the input sequence $(P_k) = (p_k)$, but otherwise they depend on a common parameter $s$. It follows that the relation (3.4) arises from (3.11) by randomizing $s^{-\delta}$ according to the law of the $\delta$-DIVERSITY of $\mathbb{P}_\delta(\zeta)$. The diagram in (3.4) can hence be expressed in terms of random partitions on $[n]$ as follows.

Step 1. Draw a variable $S$ having the law of the $\delta$-DIVERSITY of a $\mathbb{P}_\delta(\zeta)$ exchangeable partition.
Step 2. Setting $S^{-1/\delta} = s$, form a random partition $\{B_1, \ldots, B_{K_n}\}$ of $[n]$ according to a $\mathbb{P}_\alpha(\zeta^{1/\delta}s)$-EPPF.
Step 3. Merge these $K_n$ blocks according to an independent $\mathrm{PD}(\delta|s)$-EPPF.

This scheme produces a random partition of $[n]$ according to a $\mathbb{P}_{\alpha\delta}(\zeta)$-EPPF.
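
The mechanical part of Step 3 can be sketched as follows. The sketch only shows how a partition of the block labels is applied to the blocks of $[n]$; it does not sample from the $\delta$-DIVERSITY law, the $\mathbb{P}_\alpha(\zeta^{1/\delta}s)$-EPPF or the $\mathrm{PD}(\delta|s)$-EPPF, all of which would have to be supplied separately. The function name `merge_blocks` and the list-of-lists representation are assumptions.

```python
def merge_blocks(blocks, groups):
    """Coagulate a partition of [n].

    blocks : list of blocks, each a list of elements of [n] (output of Step 2)
    groups : partition of the block indices 0, ..., len(blocks) - 1 (drawn in Step 3)
    Returns the merged blocks, each sorted, ordered by their least element.
    """
    merged = []
    for group in groups:
        merged.append(sorted(x for i in group for x in blocks[i]))
    return sorted(merged, key=lambda block: block[0])


# three blocks of {1, ..., 5}; the first and third are merged
coarse = merge_blocks([[1, 4], [2], [3, 5]], [[0, 2], [1]])
```

Here `coarse` is `[[1, 3, 4, 5], [2]]`.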



3.2. Fragmentation

From Pitman [22, p. 112], for an input $(P_i) = (p_i)$ a fragmentation operator $P - \mathrm{Frag}((p_i), \cdot)$ is defined as the distribution of

$$\mathrm{Rank}(p_i Q_{i,j},\ i, j \ge 1),$$

where $(Q_{i,j})_{j \ge 1}$ has distribution $P$ for each $i$, and these sequences are independent as $i$ varies. In other words one splits the input $(P_i)$ by multiplying each term by an independent sequence of elements in $\mathcal{P}$ having common law $P$. In the case of (1.2), the input has a $\mathrm{PD}(\alpha\delta,\theta)$ law and is independent of the $(Q_{i,j})_{j \ge 1}$, which have common law $P = \mathrm{PD}(\alpha,-\alpha\delta)$. In this section we will show that the same fragmentation operator $\mathrm{PD}(\alpha,-\alpha\delta) - \mathrm{Frag}((p_i), \cdot)$, applied to independent inputs $(P_i)$ having law $\mathbb{P}_{\alpha\delta}(\zeta)$, gives the coagulation-fragmentation duality for the $\mathbb{P}_\alpha(\zeta)$ class that generalizes (1.2). Again we note that, unlike the coagulation operators discussed in the previous section, the input $(P_i)$ is independent of the $(Q_{i,j})_{j \ge 1}$, which agrees with the formulation in [23]. Nonetheless the validity of such results is not immediately obvious. In order to establish them we first express these fragmentation operations in terms of an equivalent distributional relationship involving bridges. In particular, let $(P^{(k)}_{\alpha,-\alpha\delta})$ denote a collection of independent $\mathrm{PD}(\alpha,-\alpha\delta)$-bridges. Then it is known that the fragmentation results in [23] can be read in terms of the distributional equivalence of bridges, for $y$ in [0, 1],

$$P_{\alpha,\theta}(y) \overset{d}{=} \sum_{k=1}^{\infty} P_k\,P^{(k)}_{\alpha,-\alpha\delta}(y),$$

where $(P_k)$ follows a $\mathrm{PD}(\alpha\delta,\theta)$ distribution. The next result extends this to our setting.
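
For finite inputs the fragmentation operator defined at the start of this section reduces to a product and a sort. The sketch below assumes the input masses and the splitting sequences are given as finite lists; drawing the rows from a $\mathrm{PD}(\alpha,-\alpha\delta)$ law is not attempted here. The function name `frag` is hypothetical.

```python
def frag(p, q_rows):
    """Rank(p_i * Q_{i,j}, i, j >= 1) for finite inputs.

    p      : input masses p_1, p_2, ...
    q_rows : q_rows[i] is the sequence (Q_{i,j})_j used to split p[i]
    """
    pieces = [p_i * q for p_i, row in zip(p, q_rows) for q in row]
    return sorted((x for x in pieces if x > 0), reverse=True)


# the first mass is split in half, the second is left whole
fragments = frag([0.5, 0.5], [[0.5, 0.5], [1.0]])
```

Here `fragments` is `[0.5, 0.25, 0.25]`; the total mass is unchanged whenever each row sums to one.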




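Theorem 3.3. Let $(P_k)$ have law $\mathbb{P}_{\alpha\delta}(\zeta)$, chosen independent of a sequence $(P^{(k)}_{\alpha,-\alpha\delta})$ of independent $\mathrm{PD}(\alpha,-\alpha\delta)$-bridges constructed from the collections of independent $\mathrm{PD}(\alpha,-\alpha\delta)$ sequences $(Q_{k,j})_{j \ge 1}$. Then,

(i) for $Q_{\alpha,\tau_\delta(\zeta)}$ a $\mathbb{P}_\alpha(\tau_\delta(\zeta))$-bridge there is the distributional equivalence

$$Q_{\alpha,\tau_\delta(\zeta)}(y) \overset{d}{=} \sum_{k=1}^{\infty} P_k\,P^{(k)}_{\alpha,-\alpha\delta}(y). \qquad (3.12)$$

(ii) Hence, unconditionally, $\mathrm{PD}(\alpha,-\alpha\delta) - \mathrm{Frag}((P_i), \cdot)$ has distribution

$$\mathrm{Rank}(P_i Q_{i,j},\ i, j \ge 1) \sim \mathbb{P}_\alpha(\tau_\delta(\zeta)).$$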

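Proof. Since the bridges are exchangeable, it suffices to verify (3.12) for some fixed $y$. For each fixed $y$, let $H^{(y)}$ denote the distribution function of the random variable $P_{\alpha,-\alpha\delta}(y)$. Let $Q_{\alpha\delta,\zeta}$ denote a $\mathbb{P}_{\alpha\delta}(\zeta)$-bridge, and define the random probability measure $Q^{(y)} = Q_{\alpha\delta,\zeta} \circ H^{(y)}$, i.e.

$$Q^{(y)}(u) = \sum_{k=1}^{\infty} P_k\,I(P^{(k)}_{\alpha,-\alpha\delta}(y) \le u).$$

It follows that for each fixed $y$ (not path-wise),

$$\sum_{k=1}^{\infty} P_k\,P^{(k)}_{\alpha,-\alpha\delta}(y) \overset{d}{=} \int_0^1 u\,Q^{(y)}(du).$$

But $Q^{(y)}(u) = \tau_{\alpha\delta}(\zeta H^{(y)}(u))/\tau_{\alpha\delta}(\zeta H^{(y)}(1))$, where $H^{(y)}(1) = 1$. Hereafter set $\tau_{\alpha\delta}(\zeta H^{(y)}(u)) := \tau^{(y)}(u)$. Now recalling the construction of $Q_{\alpha,\tau_\delta(\zeta)}(y)$ as in (3.2), it follows that (3.12) is verified by showing that

$$\big(\tau_\alpha(\tau_\delta(\zeta)),\ \tau_\alpha(\tau_\delta(\zeta)y)\big) \overset{d}{=} \Big(\tau^{(y)}(1),\ \int_0^1 u\,\tau^{(y)}(du)\Big). \qquad (3.13)$$

This will be done by establishing the equivalence of their joint Laplace transforms at positive points $(\omega_1, \omega_2)$. Notice that

$$\omega_1\tau_\alpha(\tau_\delta(\zeta)) + \omega_2\tau_\alpha(\tau_\delta(\zeta)y) = \int_0^1 [\omega_1 + \omega_2 I(u \le y)]\,\tau_\alpha(\tau_\delta(\zeta)\,du).$$

Similarly,

$$\omega_1\tau^{(y)}(1) + \omega_2\int_0^1 u\,\tau^{(y)}(du) = \int_0^1 [\omega_1 + \omega_2 u]\,\tau^{(y)}(du).$$

Conditioning on $\tau_\delta(\zeta)$, it follows using standard results for linear functionals of positive Lévy processes that the $-\log$ joint Laplace transform of the left hand side of (3.13) is given by

$$\tau_\delta(\zeta)\,E[(1 + \omega_1 + \omega_2 I(U \le y))^\alpha - 1],$$

yielding, for fixed $\zeta$,

$$\zeta\big[\big(y(1 + \omega_1 + \omega_2)^\alpha + (1 - y)(1 + \omega_1)^\alpha\big)^\delta - 1\big].$$

Conditional on $\zeta$, the joint $-\log$ Laplace transform of the right hand side of (3.13) can be expressed as

$$\zeta\,E[(1 + \omega_1 + \omega_2 P_{\alpha,-\alpha\delta}(y))^{\alpha\delta} - 1],$$

but

$$\omega_1 + \omega_2 P_{\alpha,-\alpha\delta}(y) = \int_0^1 [\omega_1 + \omega_2 I(u \le y)]\,P_{\alpha,-\alpha\delta}(du).$$

Furthermore, it is known from [29] that for $M := \int_0^1 g(u)\,P_{\alpha,-\alpha\delta}(du)$, for some positive function $g$,

$$E[(1 + M)^{\alpha\delta}] = \big(E[(1 + g(U))^\alpha]\big)^\delta.$$

Setting $g(u) = \omega_1 + \omega_2 I(u \le y)$, it follows that

$$E[(1 + \omega_1 + \omega_2 P_{\alpha,-\alpha\delta}(y))^{\alpha\delta}] = \big(y(1 + \omega_1 + \omega_2)^\alpha + (1 - y)(1 + \omega_1)^\alpha\big)^\delta,$$

concluding the result. □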

3.3. Duality 

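We can now describe the duality relation in terms of the following diagram

$$\mathbb{P}_\alpha(\tau_\delta(\zeta)) \;\underset{\mathrm{PD}(\alpha,-\alpha\delta)-\mathrm{Frag}}{\overset{\mathbb{P}_\delta(\zeta)-\mathrm{Coag}}{\rightleftarrows}}\; \mathbb{P}_{\alpha\delta}(\zeta) \qquad (3.14)$$

It follows that for $\theta > 0$, (1.2) arises by choosing $\zeta = \gamma_{\theta/(\alpha\delta)}$. We close with a formal statement.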

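Theorem 3.4. Suppose that $X$ and $Y$ are sequences in $\mathcal{P}$. Then, using the descriptions in Theorems 3.1 and 3.3, the following statements are equivalent.

(i) $X \sim \mathbb{P}_{\alpha\delta}(\zeta)$ and conditional on $X$, $Y \sim \mathrm{PD}(\alpha,-\alpha\delta) - \mathrm{Frag}(X, \cdot)$.
(ii) $Y \sim \mathbb{P}_\alpha(\tau_\delta(\zeta))$ and conditional on $Y$, $X \sim \mathbb{P}_\delta(\zeta) - \mathrm{Coag}(Y, \cdot)$, where in particular, for $Y = (p_k^{(v)})$, indicating that its $\alpha$-DIVERSITY or local time has the value $v^{-\alpha}$,

$$\mathbb{P}_\delta(\zeta) - \mathrm{Coag}((p_k^{(v)}), \cdot) \overset{d}{=} \mathrm{PK}_\delta(\eta^{(v)}) - \mathrm{Coag}((p_k^{(v)}), \cdot),$$

where on the right hand side the $\mathrm{PK}_\delta(\eta^{(v)})$ sequence and the input sequence are conditionally independent.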






4. DGM type coagulation fragmentation duality for the $\mathbb{P}_\alpha(\zeta)$ class

We now proceed to establish generalizations of the coagulation fragmentation duality (1.3) described in Bertoin and Goldschmidt [2] and Dong, Goldschmidt, and Martin [7]. In order for us to identify the appropriate generalization of this duality, we first look at the fragmentation operation.



4.1. Fragmentation



The fragmentation operator $\mathrm{Frag} - \mathrm{PD}(\alpha,1-\alpha)$ can be defined generically as follows. For an input sequence $(P_i)$, splitting of this sequence is achieved by attaching an independent $\mathrm{PD}(\alpha,1-\alpha)$ sequence, say $(Q_i)$, to the size-biased pick of $(P_i)$, say $\tilde{P}_1$, and then ranking the modified sequence. Hence for the fixed input $(P_i) = (p_i)$, let $(p^*_k)$ denote the sequence remaining after the size-biased pick $\tilde{p}_1$ is removed from $(p_i)$; then $\mathrm{Frag} - \mathrm{PD}(\alpha,1-\alpha)((p_i), \cdot)$ has the distribution equivalent to

$$\mathrm{Rank}\big((\tilde{p}_1 Q_i), (p^*_k)\big). \qquad (4.1)$$

In terms of (1.3), the input follows a $\mathrm{PD}(\alpha,\theta)$ distribution and hence it is known that the size-biased pick satisfies $\tilde{P}_1 \overset{d}{=} \beta_{1-\alpha,\theta+\alpha}$. This can be expressed in terms of the following distributional equivalence that can be found in [21, 24, 26], see also [12] for more details and references,

$$P_{\alpha,\theta}(y) \overset{d}{=} \beta_{\theta+\alpha,1-\alpha}\,P_{\alpha,\theta+\alpha}(y) + (1 - \beta_{\theta+\alpha,1-\alpha})\,I(U_1 \le y), \qquad (4.2)$$

where the variables on the right hand side are independent. In this case, the distributional result for the fragmentation operation can be verified by the following result

$$P_{\alpha,1+\theta}(y) \overset{d}{=} \beta_{\theta+\alpha,1-\alpha}\,P_{\alpha,\theta+\alpha}(y) + (1 - \beta_{\theta+\alpha,1-\alpha})\,P_{\alpha,1-\alpha}(y). \qquad (4.3)$$

Similar to the case of Pitman's $\mathrm{PD}(\alpha,-\alpha\delta)$ fragmentation operator that we applied to the $\mathbb{P}_\alpha(\zeta)$ class in the previous section, we will show that the $\mathrm{Frag} - \mathrm{PD}(\alpha,1-\alpha)$ operator is natural to use in this present setting. In order to identify the appropriate distributional relations we will need to establish generalizations of equations (4.2) and (4.3).
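
The generic operator in (4.1) can be sketched for finite inputs. The sketch separates the two ingredients: choosing the size-biased index, and splitting the chosen mass by a given sequence `q`. Sampling `q` from a $\mathrm{PD}(\alpha,1-\alpha)$ law is not attempted; the function names and the finite truncation are assumptions made for illustration.

```python
import random


def size_biased_index(p, rng):
    """Index i chosen with probability p[i], for masses p summing to 1."""
    x = rng.random()
    for i, p_i in enumerate(p):
        x -= p_i
        if x < 0:
            return i
    return len(p) - 1                 # guard against floating-point round-off


def frag_size_biased(p, q, i):
    """Rank((p_i * q_j)_j, (p_k)_{k != i}): split the picked mass p[i] by q, keep the rest."""
    rest = [p_k for k, p_k in enumerate(p) if k != i]
    split = [p[i] * q_j for q_j in q]
    return sorted((x for x in rest + split if x > 0), reverse=True)


p = [0.5, 0.25, 0.25]
i = size_biased_index(p, random.Random(0))
fragmented = frag_size_biased(p, [0.5, 0.5], i)
```

Whichever index is picked, `fragmented` has one more term than `p` and the same total mass.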

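Theorem 4.1. Let $Q_{\alpha,\zeta}$ denote a $\mathbb{P}_\alpha(\zeta)$-bridge.

(i) Then,

$$Q_{\alpha,\zeta}(y) \overset{d}{=} \frac{\tau_\alpha(\zeta y)}{\tau_\alpha(\zeta)} \overset{d}{=} \frac{\tau_\alpha((\gamma_1 + \zeta)y) + \gamma_{1-\alpha}\,I(U_1 \le y)}{\tau_\alpha(\gamma_1 + \zeta) + \gamma_{1-\alpha}}. \qquad (4.4)$$

This can be written as

$$Q_{\alpha,\zeta}(y) = (1 - \tilde{P}_1)\,Q_{\alpha,\gamma_1+\zeta}(y) + \tilde{P}_1\,I(U_1 \le y),$$

where the size-biased pick from a $\mathbb{P}_\alpha(\zeta)$ sequence can be represented as

$$\tilde{P}_1 = \frac{\gamma_{1-\alpha}}{\tau_\alpha(\gamma_1 + \zeta) + \gamma_{1-\alpha}}, \quad\text{and}\quad Q_{\alpha,\gamma_1+\zeta}(y) = \frac{\tau_\alpha((\gamma_1 + \zeta)y)}{\tau_\alpha(\gamma_1 + \zeta)}$$

is a $\mathbb{P}_\alpha(\gamma_1 + \zeta)$-bridge generally not independent of $\tilde{P}_1$.

(ii) Now replacing $I(U_1 \le y)$ by an independent $\mathrm{PD}(\alpha,1-\alpha)$-bridge, it follows that a $\mathbb{P}_\alpha(\gamma_{1/\alpha} + \zeta)$-bridge can be represented as

$$Q_{\alpha,\gamma_{1/\alpha}+\zeta}(y) \overset{d}{=} (1 - \tilde{P}_1)\,Q_{\alpha,\gamma_1+\zeta}(y) + \tilde{P}_1\,P_{\alpha,1-\alpha}(y). \qquad (4.5)$$

(iii) Furthermore, in terms of its marginal distribution, the size-biased pick $\tilde{P}_1$, which has distribution equivalent to the structural distribution of a $\mathbb{P}_\alpha(\zeta)$ sequence, can be represented as

$$\tilde{P}_1 \overset{d}{=} \beta_{1-\alpha,\alpha}\,\frac{\gamma_1}{\gamma_1 + \tau_\alpha(\zeta)} \overset{d}{=} \beta_{1-\alpha,\alpha}\left(1 - \frac{\zeta^{1/\alpha}}{(\zeta + \gamma_1)^{1/\alpha}}\right), \qquad (4.6)$$

where the variables appearing on the right hand side are independent.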

Proof. Similar to [24] for (4.2), statement [(i)] can be established by a
Bayesian argument. Let $X_1$ denote a variable so that conditional on
$Q_{\alpha,\zeta}$ its distribution is $Q_{\alpha,\zeta}$. Then noting that
conditional on $\zeta$, $Q_{\alpha,\zeta}$ is a generalized gamma bridge, it
follows that the posterior distribution of $Q_{\alpha,\zeta}$ given
$X_1, \zeta$ can be read from James, Lijoi and Prünster [15, Proposition 1,
Theorems 1 and 2]. By scaling arguments, involving properties of the
$\tau_\alpha$ subordinator, it follows that the unconditional distribution of
$Q_{\alpha,\zeta}$ can be represented as

$$Q_{\alpha,\zeta}(y) = \frac{\tau_\alpha(\zeta y)}{\tau_\alpha(\zeta)} \overset{d}{=} \frac{\tau_\alpha(\zeta(1+\Lambda)^\alpha y) + \gamma_{1-\alpha}\,\mathbb{I}_{(U_1 \le y)}}{\tau_\alpha(\zeta(1+\Lambda)^\alpha) + \gamma_{1-\alpha}},$$

where $\Lambda$ is a variable, appearing in James, Lijoi and Prünster [15,
Proposition 1] for $n = 1$, equal in distribution to
$\gamma_1/\tau_\alpha(\zeta)$. From this, it is not difficult to see that the
distribution of $(\Lambda, \zeta)$ is given proportional to

$$F_\zeta(dx)\, x\, e^{-x[(1+\lambda)^\alpha - 1]} (1+\lambda)^{\alpha-1}.$$

Manipulating this distribution easily shows that

$$\Lambda \overset{d}{=} \zeta^{-1/\alpha} (\gamma_1+\zeta)^{1/\alpha} - 1 \overset{d}{=} \frac{\gamma_1}{\tau_\alpha(\zeta)}.$$

The identification of the size-biased pick is a consequence of the Bayesian
argument. For statement [(ii)], note that since
$\gamma_{1-\alpha} \overset{d}{=} \tau_\alpha(\gamma_{(1-\alpha)/\alpha})$ and
is independent of $P_{\alpha,1-\alpha}(y)$, representable as
$\tau_\alpha(\gamma_{(1-\alpha)/\alpha}\, y)/\tau_\alpha(\gamma_{(1-\alpha)/\alpha})$,
it follows that

$$\tilde P_1\, P_{\alpha,1-\alpha}(y) \overset{d}{=} \frac{\tau_\alpha(\gamma_{(1-\alpha)/\alpha}\, y)}{\tau_\alpha(\gamma_1+\zeta) + \tau_\alpha(\gamma_{(1-\alpha)/\alpha})}.$$

Pushing terms together and using the fact that
$\gamma_{(1-\alpha)/\alpha} + \gamma_1 \overset{d}{=} \gamma_{1/\alpha}$
completes the result. Statement [(iii)] follows from standard beta-gamma
algebra and the results we discussed above. □

It is evident that the results in (4.4) and (4.5) lead to the validity of the
fragmentation operator.
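The step in the proof of Theorem 4.1 that identifies the law of $\Lambda$ can be spelled out as follows. This is a sketch that assumes only the joint law of $(\Lambda, \zeta)$ displayed in the proof, with $x$ the value of $\zeta$ and $\lambda$ the value of $\Lambda$; the normalizing constant $\alpha$ is supplied here.

```latex
% Conditional tail of \Lambda given \zeta = x, from the joint law in the proof:
\begin{align*}
P(\Lambda > \lambda \mid \zeta = x)
  &= \int_\lambda^\infty \alpha x (1+u)^{\alpha-1} e^{-x[(1+u)^\alpha - 1]}\,du
   = e^{-x[(1+\lambda)^\alpha - 1]}, \\
% Tail of the candidate representation, with \gamma_1 a standard exponential:
P\bigl(\zeta^{-1/\alpha}(\gamma_1+\zeta)^{1/\alpha} - 1 > \lambda \mid \zeta = x\bigr)
  &= P\bigl(\gamma_1 > x[(1+\lambda)^\alpha - 1]\bigr)
   = e^{-x[(1+\lambda)^\alpha - 1]}.
\end{align*}
```

The two conditional tails agree, so $\zeta(1+\Lambda)^\alpha$ has the same joint law with $\zeta$ as $\gamma_1+\zeta$, which is what turns $\tau_\alpha(\zeta(1+\Lambda)^\alpha)$ into $\tau_\alpha(\gamma_1+\zeta)$ in (4.4).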

Theorem 4.2. Let $(P_i) := (\tilde P_1, (P^*_k))$, where $\tilde P_1$ is
obtained by size-biased sampling, denote a $P_\alpha(\zeta)$ sequence and let
$(Q_i)$ denote an independent PD$(\alpha, 1-\alpha)$ sequence. Then
Frag-PD$(\alpha, 1-\alpha)((P_i), \cdot)$, defined by (4.1), satisfies

$$\mathrm{Rank}\bigl(\tilde P_1 (Q_i), (P^*_k)\bigr) \sim P_\alpha(\gamma_{1/\alpha} + \zeta).$$

Remark 4.1. The description of the structural distribution of
$P_\alpha(\zeta)$ given in (4.6) is new. When $\zeta$ is a constant the result
gives an explicit description of the structural distribution of a generalized
gamma process that is discussed in [20, p. 15].
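As a sanity check on (4.6), one can take $\zeta = \gamma_{\theta/\alpha}$, in which case the structural distribution should reduce to the $\beta_{1-\alpha,\theta+\alpha}$ law recalled before (4.2). The Monte Carlo sketch below reads (4.6) as $\tilde P_1 \overset{d}{=} \beta_{1-\alpha,\alpha}\,[1 - \zeta^{1/\alpha}/(\zeta+\gamma_1)^{1/\alpha}]$; the parameter values, sample size and variable names are arbitrary choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
alpha, theta, n = 0.5, 1.0, 200_000   # illustrative values, not from the text

zeta = rng.gamma(theta / alpha, size=n)   # zeta = gamma_{theta/alpha}
g1 = rng.gamma(1.0, size=n)               # gamma_1, a standard exponential
b = rng.beta(1 - alpha, alpha, size=n)    # beta_{1-alpha, alpha}

# Second form of (4.6): beta * (1 - (zeta / (zeta + gamma_1))**(1/alpha))
pick = b * (1.0 - (zeta / (zeta + g1)) ** (1.0 / alpha))

# Exact first two moments of a Beta(1 - alpha, theta + alpha) variable.
m1 = (1 - alpha) / (1 + theta)
m2 = (1 - alpha) * (2 - alpha) / ((1 + theta) * (2 + theta))

mean_gap = abs(pick.mean() - m1)
second_moment_gap = abs((pick ** 2).mean() - m2)
```

Both gaps should be of the order of the Monte Carlo error, which is consistent with, though of course not a proof of, the reduction to the two-parameter case.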

4.2. Coagulation via simple bridges

The fragmentation result now shows us that we need to find a coagulation
operation such that when the random input is a
$P_\alpha(\gamma_{1/\alpha}+\zeta)$ sequence in $\mathcal{P}$, the resulting
distribution of the operator is $P_\alpha(\zeta)$. We first show that one can
provide a generalization of the coagulation operator defined in Dong,
Goldschmidt, and Martin (DGM) [7] using simple bridges.

We can describe this type of operator through the inverse of simple bridges.
Recall the discussion on exchangeable bridges where a randomized simple bridge
$b_{s_1}$ is defined in (2.1). The inverse of a simple bridge is denoted as
$b^{-1}_{s_1}$. From Bertoin ([1], eq. (4.14), p. 194) one sees that for
$(U'_k)_{k \ge 1}$ iid Uniform[0, 1] random variables independent of $U_1$,

$$b^{-1}_{s_1}(U'_k) = U_1 \quad\text{iff}\quad U'_k \in \bigl(s_0 U_1,\ (1 - s_0) + s_0 U_1\bigr),$$

an interval having length $s_1 = 1 - s_0$, and otherwise
$b^{-1}_{s_1}(U'_k) = U_k$ has an independent Uniform[0, 1] distribution, that
is for $U'_k \in [0, s_0 U_1] \cup [s_0 U_1 + 1 - s_0, 1]$. More precisely,
define for each $k$,

$$I_k = \mathbb{I}_{(b^{-1}_{s_1}(U'_k) = U_1)} = \mathbb{I}_{(\tilde U_k \le s_1)}, \tag{4.7}$$

then

$$\mathbb{P}\bigl(b^{-1}_{s_1}(U'_k) \le y \mid I_k = 0\bigr) = y, \quad y \in [0, 1].$$
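The action of the inverse bridge on a single point can be sketched as follows. The sketch assumes the simple bridge has the form $b_{s_1}(y) = s_0 y + s_1 \mathbb{I}_{(U_1 \le y)}$ with $s_0 = 1 - s_1$ and $0 < s_1 < 1$, which is consistent with the interval described above; the function names are hypothetical.

```python
def simple_bridge(y, s1, u1):
    """Assumed form b_{s1}(y) = s0*y + s1*1{u1 <= y}, with s0 = 1 - s1."""
    s0 = 1.0 - s1
    return s0 * y + (s1 if u1 <= y else 0.0)

def simple_bridge_inverse(u, s1, u1):
    """Inverse of b_{s1}: every u inside the jump interval maps to u1."""
    s0 = 1.0 - s1                 # requires s1 < 1 so that s0 > 0
    if u < s0 * u1:               # below the jump: linear part with slope s0
        return u / s0
    if u < s0 * u1 + s1:          # inside the jump of height s1
        return u1
    return (u - s1) / s0          # above the jump: shifted linear part

# With s1 = 0.4 and u1 = 0.5 the jump interval is (0.3, 0.7).
inside = simple_bridge_inverse(0.5, 0.4, 0.5)
below = simple_bridge_inverse(0.15, 0.4, 0.5)
above = simple_bridge_inverse(0.85, 0.4, 0.5)
```

Points falling in the jump interval are all sent to the same location $U_1$, which is the mechanism that merges the corresponding masses in the coagulation operator.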

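Now in general for some exchangeable bridge of the form
$P(y) = \sum_{k=1}^{\infty} P_k\, \mathbb{I}_{(U'_k \le y)}$, one has

$$P(b_{s_1}(y)) = \sum_{k=1}^{\infty} P_k\, \mathbb{I}_{(b^{-1}_{s_1}(U'_k) \le y)} = \mathbb{I}_{(U_1 \le y)} \Bigl[ \sum_{\{k : I_k = 1\}} P_k \Bigr] + \sum_{\{k : I_k = 0\}} P_k\, \mathbb{I}_{(U_k \le y)},$$

where

$$P(s_1) \overset{d}{=} \sum_{\{k : I_k = 1\}} P_k.$$

Hence the law of the sequence $(Q_k) \in \mathcal{P}$, such that
$P \circ b_{s_1}(y) = \sum_{k=1}^{\infty} Q_k\, \mathbb{I}_{(U_k \le y)}$, can
be expressed as

$$(Q_i) = \mathrm{Rank}\Bigl( (P_k : I_k = 0),\ \sum_{\{k : I_k = 1\}} P_k \Bigr). \tag{4.8}$$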

Hence by fixing $(P_i) = (p_i)$ in (4.8), for an input $(p_i)$ we define the
Coag operator $s_1$-Coag$((p_i), \cdot)$ as the operator

$$s_1\text{-Coag}((p_i), \cdot) := \mathrm{Rank}\Bigl( (p_k : I_k = 0),\ \sum_{\{k : I_k = 1\}} p_k \Bigr). \tag{4.9}$$

When $s_1 = \beta_{(1-\alpha)/\alpha,\,(\theta+\alpha)/\alpha}$, (4.9)
coincides with the operator in Dong, Goldschmidt, and Martin [7]. In that
case, the beta variable is chosen a priori to be independent of the input. The
next result shows that we need to choose $s_1$ generally dependent on the
input.
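The operator in (4.9) can be sketched on a finite mass partition once a value of $s_1$ is given. In this illustration the indicators $I_k$ are generated as independent Bernoulli($s_1$) marks, which is an assumption based on the indicator description above; the function name `coag` is hypothetical, and $s_1$ is passed in as a number, so any dependence of $s_1$ on the input has to be handled by the caller.

```python
import numpy as np

def coag(p, s1, rng):
    """s1-Coag on a finite mass partition (illustrative sketch).

    Each block is marked independently with probability s1; the marked
    blocks are merged into a single block and the result is ranked.
    """
    p = np.asarray(p, dtype=float)
    marks = rng.uniform(size=len(p)) <= s1   # I_k = 1 for marked blocks
    kept = p[~marks]                         # blocks with I_k = 0
    if marks.any():
        merged_mass = p[marks].sum()         # sum of p_k over {k : I_k = 1}
        kept = np.concatenate([kept, [merged_mass]])
    return np.sort(kept)[::-1]               # Rank: decreasing rearrangement

rng = np.random.default_rng(2)
p = [0.4, 0.3, 0.2, 0.1]
some = coag(p, 0.5, rng)
everything = coag(p, 1.0, rng)   # s1 = 1 merges all blocks into one
```

Total mass is preserved, and the number of blocks never increases, which is the qualitative behaviour expected of a coagulation.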



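Theorem 4.3. Let $b_{s_1}$ denote a simple randomized bridge with

$$s_1 = \beta_{(\frac{1-\alpha}{\alpha},1)}\, \frac{\gamma_{1/\alpha}}{\gamma_{1/\alpha} + \zeta}, \tag{4.10}$$

where the beta variable is taken independent of the independent pair
$(\gamma_{1/\alpha}, \zeta)$. Using this same pair define the exchangeable
bridge

$$Q_{\alpha,\gamma_{1/\alpha}+\zeta}(y) = \frac{\tau_\alpha((\zeta + \gamma_{1/\alpha}) y)}{\tau_\alpha(\gamma_{1/\alpha} + \zeta)} = \sum_{k=1}^{\infty} P_k\, \mathbb{I}_{(U'_k \le y)}.$$

Marginally $Q_{\alpha,\gamma_{1/\alpha}+\zeta}$ is a
$P_\alpha(\gamma_{1/\alpha} + \zeta)$-bridge with $\alpha$-DIVERSITY
$T^{-\alpha}$.

(i) Then, for each $y \in [0, 1]$,

$$Q_{\alpha,\gamma_{1/\alpha}+\zeta}(b_{s_1}(y)) = Q_{\alpha,\zeta}(y),$$

where $Q_{\alpha,\zeta}$ is a $P_\alpha(\zeta)$-bridge.

(ii) Then it follows that $s_1$-Coag$((P_k), \cdot)$ has distribution

$$\mathrm{Rank}\Bigl( (P_k : I_k = 0),\ \sum_{\{k : I_k = 1\}} P_k \Bigr) \sim P_\alpha(\zeta).$$

(iii) Let $(P_k) = (p_k^{(v)})$ denote the realization such that $T = v$; then
$s_1$-Coag$((p_k^{(v)}), \cdot)$ given $(p_k^{(v)})$ is equivalent in
distribution to

$$\mathrm{Rank}\Bigl( (p_k^{(v)} : I_k^{(v)} = 0),\ \sum_{\{k : I_k^{(v)} = 1\}} p_k^{(v)} \Bigr),$$

where $I_k^{(v)} = \mathbb{I}_{(\tilde U_k \le s_1^{(v)})}$, and $s_1^{(v)}$
has the conditional distribution of $s_1$ given $(p_k^{(v)})$. In particular
the distribution of $s_1^{(v)}$ equates with the distribution of
$s_1 \mid T = v$, that is,

$$s_1^{(v)} \overset{d}{=} \beta_{(\frac{1-\alpha}{\alpha},1)}\, W^{(v)},$$

where, for $y \in [0, 1]$, the density of $1 - W^{(v)}$ is given proportional
to

$$(1 - y)^{1/\alpha - 1}\, y^{-1/\alpha - 1}\, E\bigl[ e^{-v \zeta^{1/\alpha} / y^{1/\alpha}}\, e^{\zeta}\, \zeta^{1/\alpha} \bigr].$$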



Proof. Statement [(i)] follows from the equivalence in (4.4), since it is easy
to see that, for $y \in [0, 1]$,

$$Q_{\alpha,\gamma_{1/\alpha}+\zeta}(b_{s_1}(y)) \overset{d}{=} \frac{\tau_\alpha((\gamma_1 + \zeta) y) + \tau_\alpha(\gamma_{(1-\alpha)/\alpha}\, \mathbb{I}_{(U_1 \le y)})}{\tau_\alpha(\gamma_{1/\alpha} + \zeta)}$$

and
$\tau_\alpha(\gamma_{(1-\alpha)/\alpha}\, \mathbb{I}_{(U_1 \le y)}) \overset{d}{=} \gamma_{1-\alpha}\, \mathbb{I}_{(U_1 \le y)}$.
[(ii)] is immediate from [(i)]. For [(iii)], we again appeal to Pitman and Yor
[26, p. 877, see eq. (96) and (97)]. That is, conditioning on $T$ it follows
that $(P_k)$ and $s_1$ are conditionally independent. It is then
straightforward to obtain the conditional density of $s_1$ given $T$. □

4.3. Duality

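We can now describe the duality relation in terms of the following diagram

$$P_\alpha(\gamma_{1/\alpha} + \zeta)\ \ \underset{\text{Frag-PD}(\alpha,\, 1-\alpha)}{\overset{\beta_{(\frac{1-\alpha}{\alpha},1)} \frac{\gamma_{1/\alpha}}{\gamma_{1/\alpha}+\zeta}\text{-Coag}}{\rightleftarrows}}\ \ P_\alpha(\zeta) \tag{4.11}$$

For $\theta > 0$ this reduces to (1.3) by setting
$\zeta = \gamma_{\theta/\alpha}$. Furthermore, replacing $\zeta$ by
$\gamma_{(n-1)/\alpha} + \zeta$ in (4.11) leads to a recursion representable
as

$$P_\alpha(\gamma_{n/\alpha} + \zeta)\ \ \underset{\text{Frag-PD}(\alpha,\, 1-\alpha)}{\overset{\beta_{(\frac{1-\alpha}{\alpha},\frac{n-1+\alpha}{\alpha})} \frac{\gamma_{n/\alpha}}{\gamma_{n/\alpha}+\zeta}\text{-Coag}}{\rightleftarrows}}\ \ P_\alpha(\gamma_{(n-1)/\alpha} + \zeta)$$

where

$$\beta_{(\frac{1-\alpha}{\alpha},\frac{n-1+\alpha}{\alpha})}\, \frac{\gamma_{n/\alpha}}{\gamma_{n/\alpha} + \zeta} \overset{d}{=} \beta_{(\frac{1-\alpha}{\alpha},1)}\, \frac{\gamma_{1/\alpha}}{\gamma_{n/\alpha} + \zeta}.$$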
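The identity used in the recursion is an instance of standard beta-gamma algebra. The following sketch assumes $\gamma_{n/\alpha} = \gamma_{1/\alpha} + \gamma_{(n-1)/\alpha}$ with independent summands, all independent of $\zeta$ and of the beta variable.

```latex
\begin{align*}
\beta_{(\frac{1-\alpha}{\alpha},1)}\,\frac{\gamma_{1/\alpha}}{\gamma_{n/\alpha}+\zeta}
 &= \beta_{(\frac{1-\alpha}{\alpha},1)}\,
    \frac{\gamma_{1/\alpha}}{\gamma_{n/\alpha}}\,
    \frac{\gamma_{n/\alpha}}{\gamma_{n/\alpha}+\zeta} \\
 &\overset{d}{=} \beta_{(\frac{1-\alpha}{\alpha},1)}\,
    \beta_{(\frac{1}{\alpha},\frac{n-1}{\alpha})}\,
    \frac{\gamma_{n/\alpha}}{\gamma_{n/\alpha}+\zeta}
    && \text{(the ratio is independent of } \gamma_{n/\alpha}\text{)} \\
 &\overset{d}{=} \beta_{(\frac{1-\alpha}{\alpha},\frac{n-1+\alpha}{\alpha})}\,
    \frac{\gamma_{n/\alpha}}{\gamma_{n/\alpha}+\zeta}
    && \text{(since } \beta_{a,b}\,\beta_{a+b,c} \overset{d}{=} \beta_{a,b+c}\text{)}.
\end{align*}
```

A similar computation with $\zeta = \gamma_{\theta/\alpha}$ in (4.10), using $\gamma_{1/\alpha}/(\gamma_{1/\alpha} + \gamma_{\theta/\alpha}) \overset{d}{=} \beta_{(1/\alpha,\,\theta/\alpha)}$, gives $\beta_{((1-\alpha)/\alpha,\,(\theta+\alpha)/\alpha)}$, the variable appearing in the operator of [7].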

We close with a formal statement. 

Theorem 4.4. Suppose that $X$ and $Y$ are sequences in $\mathcal{P}$. Then,
using the descriptions in Theorems 4.2 and 4.3, the following statements are
equivalent.

(i) $X \sim P_\alpha(\zeta)$ and conditional on $X$,
$Y \sim$ Frag-PD$(\alpha, 1-\alpha)(X, \cdot)$.
(ii) $Y \sim P_\alpha(\gamma_{1/\alpha} + \zeta)$ and conditional on $Y$,
$X \sim \beta_{(\frac{1-\alpha}{\alpha},1)} \frac{\gamma_{1/\alpha}}{\gamma_{1/\alpha}+\zeta}$-Coag$(Y, \cdot)$.

Where in particular, for $Y = (p_k^{(v)})$, indicating its $\alpha$-DIVERSITY
or local time has value $v^{-\alpha}$,

$$\beta_{(\frac{1-\alpha}{\alpha},1)} \frac{\gamma_{1/\alpha}}{\gamma_{1/\alpha}+\zeta}\text{-Coag}((p_k^{(v)}), \cdot) = s_1^{(v)}\text{-Coag}((p_k^{(v)}), \cdot),$$

which is described in [(iii)] of Theorem 4.3.